Dataset statistics
| | Dataset A | Dataset B |
|---|---|---|
| Number of variables | 12 | 12 |
| Number of observations | 446 | 446 |
| Missing cells | 446 | 431 |
| Missing cells (%) | 8.3% | 8.1% |
| Duplicate rows | 0 | 0 |
| Duplicate rows (%) | 0.0% | 0.0% |
| Total size in memory | 45.3 KiB | 45.3 KiB |
| Average record size in memory | 104.0 B | 104.0 B |
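The missing-cell percentages can be checked against the other figures in this table, assuming the total number of cells is variables × observations. A minimal Python sketch (the helper name `missing_cell_pct` is illustrative and not part of ydata-profiling):

```python
def missing_cell_pct(missing_cells: int, n_vars: int, n_obs: int) -> float:
    """Share of missing cells, in percent, over all cells in the table."""
    return 100.0 * missing_cells / (n_vars * n_obs)


# Figures copied from the "Dataset statistics" table.
pct_a = missing_cell_pct(446, 12, 446)
pct_b = missing_cell_pct(431, 12, 446)
print(f"Dataset A: {pct_a:.1f}%  Dataset B: {pct_b:.1f}%")
```

Both values round to the percentages reported in the table.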
Variable types
| | Dataset A | Dataset B |
|---|---|---|
| Numeric | 5 | 5 |
| Categorical | 4 | 4 |
| Text | 3 | 3 |
| Dataset A | Dataset B | Alert type |
|---|---|---|
| Sex is highly overall correlated with Survived | Sex is highly overall correlated with Survived | High Correlation |
| Survived is highly overall correlated with Sex | Survived is highly overall correlated with Sex | High Correlation |
| Age has 97 (21.7%) missing values | Age has 90 (20.2%) missing values | Missing |
| Cabin has 348 (78.0%) missing values | Cabin has 340 (76.2%) missing values | Missing |
| PassengerId has unique values | PassengerId has unique values | Unique |
| Name has unique values | Name has unique values | Unique |
| SibSp has 299 (67.0%) zeros | SibSp has 315 (70.6%) zeros | Zeros |
| Parch has 342 (76.7%) zeros | Parch has 327 (73.3%) zeros | Zeros |
| Fare has 10 (2.2%) zeros | Fare has 8 (1.8%) zeros | Zeros |
Reproduction
| | Dataset A | Dataset B |
|---|---|---|
| Analysis started | 2024-07-15 20:42:40.610096 | 2024-07-15 20:42:44.823775 |
| Analysis finished | 2024-07-15 20:42:44.822629 | 2024-07-15 20:42:49.043833 |
| Duration | 4.21 seconds | 4.22 seconds |
| Software version | ydata-profiling v0.0.dev0 | ydata-profiling v0.0.dev0 |
| Download configuration | config.json | config.json |
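The reported durations are consistent with the start and finish timestamps. A small sketch that recomputes them with the standard library, assuming the timestamps use the format shown in the table (the function name `duration_seconds` is illustrative):

```python
from datetime import datetime

# Timestamp layout as it appears in the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed wall-clock time between two timestamps, in seconds."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


dur_a = duration_seconds("2024-07-15 20:42:40.610096", "2024-07-15 20:42:44.822629")
dur_b = duration_seconds("2024-07-15 20:42:44.823775", "2024-07-15 20:42:49.043833")
print(f"Dataset A: {dur_a:.2f} s  Dataset B: {dur_b:.2f} s")
```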
PassengerId
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 441.5 | 460.98206 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 1 | 1 |
| Maximum | 891 | 891 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 1 | 1 |
| 5-th percentile | 42.25 | 62.25 |
| Q1 | 226.25 | 230.25 |
| median | 426.5 | 472 |
| Q3 | 673.5 | 684.75 |
| 95-th percentile | 843.5 | 854.75 |
| Maximum | 891 | 891 |
| Range | 890 | 890 |
| Interquartile range (IQR) | 447.25 | 454.5 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 258.85613 | 256.75278 |
| Coefficient of variation (CV) | 0.58631061 | 0.55696914 |
| Kurtosis | -1.2048005 | -1.2100875 |
| Mean | 441.5 | 460.98206 |
| Median Absolute Deviation (MAD) | 222.5 | 224 |
| Skewness | 0.016159286 | -0.06852953 |
| Sum | 196909 | 205598 |
| Variance | 67006.498 | 65921.991 |
| Monotonicity | Not monotonic | Not monotonic |
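Several PassengerId entries are derived from others in the same tables: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the IQR is Q3 minus Q1, and the range is maximum minus minimum. A sketch that recomputes them from the reported figures (the dictionary layout and the `derived` helper are illustrative):

```python
# Figures copied from the PassengerId tables for Dataset A and Dataset B.
stats = {
    "A": {"mean": 441.5, "std": 258.85613, "q1": 226.25, "q3": 673.5, "min": 1, "max": 891},
    "B": {"mean": 460.98206, "std": 256.75278, "q1": 230.25, "q3": 684.75, "min": 1, "max": 891},
}


def derived(s: dict) -> dict:
    """Recompute the statistics that follow from mean, std, quartiles and extremes."""
    return {
        "cv": s["std"] / s["mean"],      # coefficient of variation
        "variance": s["std"] ** 2,       # variance as squared standard deviation
        "iqr": s["q3"] - s["q1"],        # interquartile range
        "range": s["max"] - s["min"],
    }


for name, s in stats.items():
    d = derived(s)
    print(name, round(d["cv"], 6), round(d["variance"], 2), d["iqr"], d["range"])
```

Because the inputs are rounded as displayed, the recomputed values agree with the report only up to rounding.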
| Value | Count | Frequency (%) |
| 65 | 1 | 0.2% |
| 226 | 1 | 0.2% |
| 683 | 1 | 0.2% |
| 328 | 1 | 0.2% |
| 190 | 1 | 0.2% |
| 611 | 1 | 0.2% |
| 507 | 1 | 0.2% |
| 234 | 1 | 0.2% |
| 136 | 1 | 0.2% |
| 201 | 1 | 0.2% |
| Other values (436) | 436 |
| Value | Count | Frequency (%) |
| 540 | 1 | 0.2% |
| 108 | 1 | 0.2% |
| 500 | 1 | 0.2% |
| 669 | 1 | 0.2% |
| 201 | 1 | 0.2% |
| 819 | 1 | 0.2% |
| 394 | 1 | 0.2% |
| 604 | 1 | 0.2% |
| 229 | 1 | 0.2% |
| 297 | 1 | 0.2% |
| Other values (436) | 436 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 5 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 12 | 1 | |
| 13 | 1 | |
| 15 | 1 | |
| 17 | 1 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 10 | 1 | |
| 15 | 1 | |
| 16 | 1 | |
| 18 | 1 | |
| 26 | 1 | |
| 28 | 1 | |
| 30 | 1 |
Survived
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 2 | 2 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 0 | 1 |
| 2nd row | 1 | 1 |
| 3rd row | 0 | 0 |
| 4th row | 1 | 1 |
| 5th row | 0 | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
| Value | Count | Frequency (%) |
| 0 | 268 | |
| 1 | 178 |
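The frequency column is empty for these rows, but each share follows from the counts as count divided by the total. A minimal sketch using the Survived counts listed above (the `shares` helper is illustrative, and rounding to one decimal is an assumption based on the report's other percentages):

```python
def shares(counts: dict) -> dict:
    """Percentage share of each value, rounded to one decimal place."""
    total = sum(counts.values())
    return {value: round(100.0 * n / total, 1) for value, n in counts.items()}


# Counts copied from the Common Values tables (first table: Dataset A, second: Dataset B).
survived_a = {0: 272, 1: 174}
survived_b = {0: 268, 1: 178}

print("Dataset A:", shares(survived_a))
print("Dataset B:", shares(survived_b))
```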
Length and Common Values (Plot) charts for Dataset A and Dataset B (plots not reproduced)
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
| Value | Count | Frequency (%) |
| 0 | 268 | |
| 1 | 178 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
| Value | Count | Frequency (%) |
| 0 | 268 | |
| 1 | 178 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
| Value | Count | Frequency (%) |
| 0 | 268 | |
| 1 | 178 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
| Value | Count | Frequency (%) |
| 0 | 268 | |
| 1 | 178 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
| Value | Count | Frequency (%) |
| 0 | 268 | |
| 1 | 178 |
Pclass
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 1 | 1 |
| 2nd row | 3 | 1 |
| 3rd row | 1 | 2 |
| 4th row | 3 | 1 |
| 5th row | 2 | 3 |
Common Values
| Value | Count | Frequency (%) |
| 3 | 246 | |
| 2 | 102 | |
| 1 | 98 | 22.0% |
| Value | Count | Frequency (%) |
| 3 | 246 | |
| 1 | 110 | |
| 2 | 90 | 20.2% |
Length and Common Values (Plot) charts for Dataset A and Dataset B (plots not reproduced)
| Value | Count | Frequency (%) |
| 3 | 246 | |
| 2 | 102 | |
| 1 | 98 | 22.0% |
| Value | Count | Frequency (%) |
| 3 | 246 | |
| 1 | 110 | |
| 2 | 90 | 20.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 246 | |
| 2 | 102 | |
| 1 | 98 | 22.0% |
| Value | Count | Frequency (%) |
| 3 | 246 | |
| 1 | 110 | |
| 2 | 90 | 20.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 246 | |
| 2 | 102 | |
| 1 | 98 | 22.0% |
| Value | Count | Frequency (%) |
| 3 | 246 | |
| 1 | 110 | |
| 2 | 90 | 20.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 246 | |
| 2 | 102 | |
| 1 | 98 | 22.0% |
| Value | Count | Frequency (%) |
| 3 | 246 | |
| 1 | 110 | |
| 2 | 90 | 20.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 246 | |
| 2 | 102 | |
| 1 | 98 | 22.0% |
| Value | Count | Frequency (%) |
| 3 | 246 | |
| 1 | 110 | |
| 2 | 90 | 20.2% |
Name
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 82 | 82 |
| Median length | 51 | 50 |
| Mean length | 26.79148 | 26.94843 |
| Min length | 12 | 12 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 11949 | 12019 |
| Distinct characters | 59 | 59 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 446 | 446 |
| Unique (%) | 100.0% | 100.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | Stewart, Mr. Albert A | Frolicher, Miss. Hedwig Margaritha |
| 2nd row | Niskanen, Mr. Juha | Swift, Mrs. Frederick Joel (Margaret Welles Barron) |
| 3rd row | Isham, Miss. Ann Elizabeth | Hodges, Mr. Henry Price |
| 4th row | Landergren, Miss. Aurora Adelia | Duff Gordon, Lady. (Lucille Christiana Sutherland) ("Mrs Morgan") |
| 5th row | Hold, Mr. Stephen | Garfirth, Mr. John |
| Value | Count | Frequency (%) |
| mr | 262 | 14.5% |
| miss | 84 | 4.6% |
| mrs | 61 | 3.4% |
| william | 36 | 2.0% |
| master | 26 | 1.4% |
| john | 23 | 1.3% |
| henry | 21 | 1.2% |
| charles | 15 | 0.8% |
| thomas | 14 | 0.8% |
| george | 11 | 0.6% |
| Other values (906) | 1258 |
| Value | Count | Frequency (%) |
| mr | 263 | 14.5% |
| miss | 88 | 4.8% |
| mrs | 67 | 3.7% |
| william | 28 | 1.5% |
| john | 21 | 1.2% |
| master | 20 | 1.1% |
| henry | 17 | 0.9% |
| charles | 14 | 0.8% |
| george | 13 | 0.7% |
| thomas | 11 | 0.6% |
| Other values (907) | 1273 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1366 | 11.4% |
| r | 967 | 8.1% |
| e | 846 | 7.1% |
| a | 803 | 6.7% |
| s | 658 | 5.5% |
| n | 653 | 5.5% |
| i | 643 | 5.4% |
| M | 544 | 4.6% |
| l | 528 | 4.4% |
| o | 522 | 4.4% |
| Other values (49) | 4419 |
| Value | Count | Frequency (%) |
| (space) | 1371 | 11.4% |
| r | 1004 | 8.4% |
| e | 863 | 7.2% |
| a | 806 | 6.7% |
| i | 671 | 5.6% |
| s | 659 | 5.5% |
| n | 634 | 5.3% |
| M | 558 | 4.6% |
| l | 537 | 4.5% |
| o | 516 | 4.3% |
| Other values (49) | 4400 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 11949 |
| Value | Count | Frequency (%) |
| (unknown) | 12019 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1366 | 11.4% |
| r | 967 | 8.1% |
| e | 846 | 7.1% |
| a | 803 | 6.7% |
| s | 658 | 5.5% |
| n | 653 | 5.5% |
| i | 643 | 5.4% |
| M | 544 | 4.6% |
| l | 528 | 4.4% |
| o | 522 | 4.4% |
| Other values (49) | 4419 |
| Value | Count | Frequency (%) |
| (space) | 1371 | 11.4% |
| r | 1004 | 8.4% |
| e | 863 | 7.2% |
| a | 806 | 6.7% |
| i | 671 | 5.6% |
| s | 659 | 5.5% |
| n | 634 | 5.3% |
| M | 558 | 4.6% |
| l | 537 | 4.5% |
| o | 516 | 4.3% |
| Other values (49) | 4400 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 11949 |
| Value | Count | Frequency (%) |
| (unknown) | 12019 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1366 | 11.4% |
| r | 967 | 8.1% |
| e | 846 | 7.1% |
| a | 803 | 6.7% |
| s | 658 | 5.5% |
| n | 653 | 5.5% |
| i | 643 | 5.4% |
| M | 544 | 4.6% |
| l | 528 | 4.4% |
| o | 522 | 4.4% |
| Other values (49) | 4419 |
| Value | Count | Frequency (%) |
| (space) | 1371 | 11.4% |
| r | 1004 | 8.4% |
| e | 863 | 7.2% |
| a | 806 | 6.7% |
| i | 671 | 5.6% |
| s | 659 | 5.5% |
| n | 634 | 5.3% |
| M | 558 | 4.6% |
| l | 537 | 4.5% |
| o | 516 | 4.3% |
| Other values (49) | 4400 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 11949 |
| Value | Count | Frequency (%) |
| (unknown) | 12019 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1366 | 11.4% |
| r | 967 | 8.1% |
| e | 846 | 7.1% |
| a | 803 | 6.7% |
| s | 658 | 5.5% |
| n | 653 | 5.5% |
| i | 643 | 5.4% |
| M | 544 | 4.6% |
| l | 528 | 4.4% |
| o | 522 | 4.4% |
| Other values (49) | 4419 |
| Value | Count | Frequency (%) |
| (space) | 1371 | 11.4% |
| r | 1004 | 8.4% |
| e | 863 | 7.2% |
| a | 806 | 6.7% |
| i | 671 | 5.6% |
| s | 659 | 5.5% |
| n | 634 | 5.3% |
| M | 558 | 4.6% |
| l | 537 | 4.5% |
| o | 516 | 4.3% |
| Other values (49) | 4400 |
Sex
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 6 | 6 |
| Median length | 4 | 4 |
| Mean length | 4.6636771 | 4.7040359 |
| Min length | 4 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 2080 | 2098 |
| Distinct characters | 5 | 5 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | male | female |
| 2nd row | male | female |
| 3rd row | female | male |
| 4th row | female | female |
| 5th row | male | male |
Common Values
| Value | Count | Frequency (%) |
| male | 298 | |
| female | 148 |
| Value | Count | Frequency (%) |
| male | 289 | |
| female | 157 |
Length and Common Values (Plot) charts for Dataset A and Dataset B (plots not reproduced)
| Value | Count | Frequency (%) |
| male | 298 | |
| female | 148 |
| Value | Count | Frequency (%) |
| male | 289 | |
| female | 157 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 594 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 148 | 7.1% |
| Value | Count | Frequency (%) |
| e | 603 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 157 | 7.5% |
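The character counts for Sex can be reproduced from the value counts alone, since every row is either `male` or `female`. A sketch with `collections.Counter` (the `char_counts` helper is illustrative):

```python
from collections import Counter


def char_counts(value_counts: dict) -> Counter:
    """Total occurrences of each character, weighting each value by its row count."""
    chars = Counter()
    for value, n_rows in value_counts.items():
        for ch, per_value in Counter(value).items():
            chars[ch] += per_value * n_rows
    return chars


# Value counts copied from the Common Values tables for Sex.
sex_a = {"male": 298, "female": 148}
sex_b = {"male": 289, "female": 157}

for name, counts in (("A", sex_a), ("B", sex_b)):
    chars = char_counts(counts)
    total_chars = sum(chars.values())
    mean_length = total_chars / sum(counts.values())
    f_share = 100.0 * chars["f"] / total_chars
    print(name, dict(chars), total_chars, round(mean_length, 7), round(f_share, 1))
```

The totals match the "Total characters" and "Mean length" rows, and the share of `f` matches the only frequency shown in the character tables.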
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2080 |
| Value | Count | Frequency (%) |
| (unknown) | 2098 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 594 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 148 | 7.1% |
| Value | Count | Frequency (%) |
| e | 603 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 157 | 7.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2080 |
| Value | Count | Frequency (%) |
| (unknown) | 2098 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 594 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 148 | 7.1% |
| Value | Count | Frequency (%) |
| e | 603 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 157 | 7.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2080 |
| Value | Count | Frequency (%) |
| (unknown) | 2098 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 594 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 148 | 7.1% |
| Value | Count | Frequency (%) |
| e | 603 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 157 | 7.5% |
Age
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 74 | 69 |
| Distinct (%) | 21.2% | 19.4% |
| Missing | 97 | 90 |
| Missing (%) | 21.7% | 20.2% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 29.086447 | 29.0075 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.42 | 0.75 |
| Maximum | 80 | 80 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.42 | 0.75 |
| 5-th percentile | 4 | 4 |
| Q1 | 21 | 19 |
| median | 28 | 27 |
| Q3 | 37 | 38 |
| 95-th percentile | 53.6 | 54.25 |
| Maximum | 80 | 80 |
| Range | 79.58 | 79.25 |
| Interquartile range (IQR) | 16 | 19 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 14.166877 | 14.131827 |
| Coefficient of variation (CV) | 0.48706112 | 0.48717838 |
| Kurtosis | 0.55558507 | 0.14541107 |
| Mean | 29.086447 | 29.0075 |
| Median Absolute Deviation (MAD) | 8 | 9 |
| Skewness | 0.4338516 | 0.38466988 |
| Sum | 10151.17 | 10326.67 |
| Variance | 200.70041 | 199.70853 |
| Monotonicity | Not monotonic | Not monotonic |
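Because Age has missing values, its mean is consistent with dividing the sum by the number of non-missing observations rather than by all 446 rows. A sketch of that check using the reported figures (the `mean_of_non_missing` helper is illustrative, and the choice of denominator is inferred from the numbers, not confirmed by the report):

```python
def mean_of_non_missing(total_sum: float, n_obs: int, n_missing: int) -> float:
    """Mean computed over non-missing observations only."""
    return total_sum / (n_obs - n_missing)


# Sum, observation count and missing count copied from the Age tables.
mean_a = mean_of_non_missing(10151.17, 446, 97)   # 349 valid rows
mean_b = mean_of_non_missing(10326.67, 446, 90)   # 356 valid rows
print(round(mean_a, 4), round(mean_b, 4))

# Dividing by all rows instead gives a noticeably lower value.
naive_a = 10151.17 / 446
print(round(naive_a, 4))
```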
| Value | Count | Frequency (%) |
| 24 | 19 | 4.3% |
| 22 | 15 | 3.4% |
| 30 | 14 | 3.1% |
| 18 | 13 | 2.9% |
| 28 | 12 | 2.7% |
| 36 | 11 | 2.5% |
| 25 | 11 | 2.5% |
| 32 | 10 | 2.2% |
| 23 | 10 | 2.2% |
| 29 | 10 | 2.2% |
| Other values (64) | 224 | |
| (Missing) | 97 |
| Value | Count | Frequency (%) |
| 18 | 16 | 3.6% |
| 24 | 15 | 3.4% |
| 25 | 14 | 3.1% |
| 22 | 13 | 2.9% |
| 21 | 13 | 2.9% |
| 19 | 13 | 2.9% |
| 26 | 11 | 2.5% |
| 36 | 11 | 2.5% |
| 28 | 10 | 2.2% |
| 34 | 10 | 2.2% |
| Other values (59) | 230 | |
| (Missing) | 90 | 20.2% |
| Value | Count | Frequency (%) |
| 0.42 | 1 | 0.2% |
| 0.83 | 1 | 0.2% |
| 0.92 | 1 | 0.2% |
| 1 | 3 | |
| 2 | 3 | |
| 3 | 4 | |
| 4 | 6 | |
| 5 | 3 | |
| 6 | 3 | |
| 7 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0.75 | 1 | 0.2% |
| 0.92 | 1 | 0.2% |
| 1 | 2 | 0.4% |
| 2 | 4 | |
| 3 | 4 | |
| 4 | 7 | |
| 5 | 3 | |
| 6 | 3 | |
| 8 | 2 | 0.4% |
| 9 | 4 |
SibSp
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 1.6% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.56278027 | 0.47309417 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 8 | 8 |
| Zeros | 299 | 315 |
| Zeros (%) | 67.0% | 70.6% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 1 |
| 95-th percentile | 3 | 2 |
| Maximum | 8 | 8 |
| Range | 8 | 8 |
| Interquartile range (IQR) | 1 | 1 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 1.1971334 | 1.0113703 |
| Coefficient of variation (CV) | 2.1271773 | 2.1377781 |
| Kurtosis | 17.284527 | 17.054265 |
| Mean | 0.56278027 | 0.47309417 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 3.7271473 | 3.5561665 |
| Sum | 251 | 211 |
| Variance | 1.4331284 | 1.02287 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 299 | |
| 1 | 108 | 24.2% |
| 2 | 15 | 3.4% |
| 4 | 8 | 1.8% |
| 3 | 7 | 1.6% |
| 8 | 5 | 1.1% |
| 5 | 4 | 0.9% |
| Value | Count | Frequency (%) |
| 0 | 315 | |
| 1 | 97 | 21.7% |
| 2 | 12 | 2.7% |
| 4 | 10 | 2.2% |
| 3 | 8 | 1.8% |
| 5 | 2 | 0.4% |
| 8 | 2 | 0.4% |
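Since SibSp takes only seven distinct values, its summary statistics can be cross-checked by expanding the value counts above back into a list. The sketch below uses the standard `statistics` module. The reported variance agrees with the sample variance (divisor n - 1), which suggests, but does not confirm, that this is the convention the tool uses. The `summarize` helper is illustrative.

```python
import statistics


def summarize(value_counts: dict) -> dict:
    """Rebuild the column from its value counts and compute basic statistics."""
    values = [v for v, n in sorted(value_counts.items()) for _ in range(n)]
    n = len(values)
    return {
        "n": n,
        "sum": sum(values),
        "mean": sum(values) / n,
        "variance": statistics.variance(values),  # sample variance, divisor n - 1
        "median": statistics.median(values),
        "zeros_pct": round(100.0 * value_counts.get(0, 0) / n, 1),
    }


# Value counts copied from the SibSp tables (first table: Dataset A, second: Dataset B).
sibsp_a = {0: 299, 1: 108, 2: 15, 3: 7, 4: 8, 5: 4, 8: 5}
sibsp_b = {0: 315, 1: 97, 2: 12, 3: 8, 4: 10, 5: 2, 8: 2}

summary_a = summarize(sibsp_a)
summary_b = summarize(sibsp_b)
print(summary_a)
print(summary_b)
```

With more than half of the rows equal to zero, the median is zero in both datasets, which also explains the zero median absolute deviation.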
| Value | Count | Frequency (%) |
| 0 | 299 | |
| 1 | 108 | 24.2% |
| 2 | 15 | 3.4% |
| 3 | 7 | 1.6% |
| 4 | 8 | 1.8% |
| 5 | 4 | 0.9% |
| 8 | 5 | 1.1% |
| Value | Count | Frequency (%) |
| 0 | 315 | |
| 1 | 97 | 21.7% |
| 2 | 12 | 2.7% |
| 3 | 8 | 1.8% |
| 4 | 10 | 2.2% |
| 5 | 2 | 0.4% |
| 8 | 2 | 0.4% |
Parch
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 6 | 6 |
| Distinct (%) | 1.3% | 1.3% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.35874439 | 0.41704036 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 5 | 5 |
| Zeros | 342 | 327 |
| Zeros (%) | 76.7% | 73.3% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 0 | 1 |
| 95-th percentile | 2 | 2 |
| Maximum | 5 | 5 |
| Range | 5 | 5 |
| Interquartile range (IQR) | 0 | 1 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.7474314 | 0.80508286 |
| Coefficient of variation (CV) | 2.083465 | 1.9304675 |
| Kurtosis | 8.0301305 | 7.2594478 |
| Mean | 0.35874439 | 0.41704036 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 2.5184273 | 2.3739713 |
| Sum | 160 | 186 |
| Variance | 0.5586537 | 0.64815841 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 342 | |
| 1 | 58 | 13.0% |
| 2 | 41 | 9.2% |
| 3 | 2 | 0.4% |
| 5 | 2 | 0.4% |
| 4 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 327 | |
| 1 | 65 | 14.6% |
| 2 | 48 | 10.8% |
| 5 | 3 | 0.7% |
| 3 | 2 | 0.4% |
| 4 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 342 | |
| 1 | 58 | 13.0% |
| 2 | 41 | 9.2% |
| 3 | 2 | 0.4% |
| 4 | 1 | 0.2% |
| 5 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 327 | |
| 1 | 65 | 14.6% |
| 2 | 48 | 10.8% |
| 3 | 2 | 0.4% |
| 4 | 1 | 0.2% |
| 5 | 3 | 0.7% |
Ticket
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 385 | 388 |
| Distinct (%) | 86.3% | 87.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 18 | 18 |
| Median length | 17 | 17 |
| Mean length | 6.793722 | 6.8475336 |
| Min length | 4 | 3 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 3030 | 3054 |
| Distinct characters | 32 | 35 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 339 | 346 |
| Unique (%) | 76.0% | 77.6% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | PC 17605 | 13568 |
| 2nd row | STON/O 2. 3101289 | 17466 |
| 3rd row | PC 17595 | 250643 |
| 4th row | C 7077 | 11755 |
| 5th row | 26707 | 358585 |
| Value | Count | Frequency (%) |
| pc | 32 | 5.5% |
| c.a | 18 | 3.1% |
| ca | 9 | 1.6% |
| a/5 | 9 | 1.6% |
| ston/o | 8 | 1.4% |
| 2 | 8 | 1.4% |
| c | 5 | 0.9% |
| 2343 | 5 | 0.9% |
| 1601 | 4 | 0.7% |
| 2144 | 4 | 0.7% |
| Other values (403) | 475 |
| Value | Count | Frequency (%) |
| pc | 25 | 4.4% |
| c.a | 16 | 2.8% |
| a/5 | 10 | 1.8% |
| w./c | 8 | 1.4% |
| ston/o | 7 | 1.2% |
| 2 | 7 | 1.2% |
| 347082 | 5 | 0.9% |
| sc/paris | 5 | 0.9% |
| ca | 4 | 0.7% |
| soton/o.q | 4 | 0.7% |
| Other values (410) | 480 |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 374 | |
| 1 | 331 | |
| 2 | 301 | |
| 7 | 262 | |
| 4 | 230 | 7.6% |
| 6 | 208 | 6.9% |
| 5 | 198 | 6.5% |
| 0 | 196 | 6.5% |
| 9 | 160 | 5.3% |
| 8 | 142 | 4.7% |
| Other values (22) | 628 |
| Value | Count | Frequency (%) |
| 3 | 378 | |
| 1 | 355 | |
| 2 | 287 | |
| 7 | 241 | 7.9% |
| 4 | 230 | 7.5% |
| 0 | 215 | 7.0% |
| 6 | 211 | 6.9% |
| 5 | 196 | 6.4% |
| 9 | 165 | 5.4% |
| 8 | 131 | 4.3% |
| Other values (25) | 645 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 3030 |
| Value | Count | Frequency (%) |
| (unknown) | 3054 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 374 | |
| 1 | 331 | |
| 2 | 301 | |
| 7 | 262 | |
| 4 | 230 | 7.6% |
| 6 | 208 | 6.9% |
| 5 | 198 | 6.5% |
| 0 | 196 | 6.5% |
| 9 | 160 | 5.3% |
| 8 | 142 | 4.7% |
| Other values (22) | 628 |
| Value | Count | Frequency (%) |
| 3 | 378 | |
| 1 | 355 | |
| 2 | 287 | |
| 7 | 241 | 7.9% |
| 4 | 230 | 7.5% |
| 0 | 215 | 7.0% |
| 6 | 211 | 6.9% |
| 5 | 196 | 6.4% |
| 9 | 165 | 5.4% |
| 8 | 131 | 4.3% |
| Other values (25) | 645 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 3030 |
| Value | Count | Frequency (%) |
| (unknown) | 3054 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 374 | |
| 1 | 331 | |
| 2 | 301 | |
| 7 | 262 | |
| 4 | 230 | 7.6% |
| 6 | 208 | 6.9% |
| 5 | 198 | 6.5% |
| 0 | 196 | 6.5% |
| 9 | 160 | 5.3% |
| 8 | 142 | 4.7% |
| Other values (22) | 628 |
| Value | Count | Frequency (%) |
| 3 | 378 | |
| 1 | 355 | |
| 2 | 287 | |
| 7 | 241 | 7.9% |
| 4 | 230 | 7.5% |
| 0 | 215 | 7.0% |
| 6 | 211 | 6.9% |
| 5 | 196 | 6.4% |
| 9 | 165 | 5.4% |
| 8 | 131 | 4.3% |
| Other values (25) | 645 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 3030 |
| Value | Count | Frequency (%) |
| (unknown) | 3054 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 374 | |
| 1 | 331 | |
| 2 | 301 | |
| 7 | 262 | |
| 4 | 230 | 7.6% |
| 6 | 208 | 6.9% |
| 5 | 198 | 6.5% |
| 0 | 196 | 6.5% |
| 9 | 160 | 5.3% |
| 8 | 142 | 4.7% |
| Other values (22) | 628 |
| Value | Count | Frequency (%) |
| 3 | 378 | |
| 1 | 355 | |
| 2 | 287 | |
| 7 | 241 | 7.9% |
| 4 | 230 | 7.5% |
| 0 | 215 | 7.0% |
| 6 | 211 | 6.9% |
| 5 | 196 | 6.4% |
| 9 | 165 | 5.4% |
| 8 | 131 | 4.3% |
| Other values (25) | 645 |
Fare
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 182 | 184 |
| Distinct (%) | 40.8% | 41.3% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 31.936368 | 34.230558 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 512.3292 | 512.3292 |
| Zeros | 10 | 8 |
| Zeros (%) | 2.2% | 1.8% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 7.162525 | 7.15 |
| Q1 | 7.8958 | 7.9031 |
| median | 14.25415 | 13.64585 |
| Q3 | 30.5 | 31.3875 |
| 95-th percentile | 110.8833 | 120 |
| Maximum | 512.3292 | 512.3292 |
| Range | 512.3292 | 512.3292 |
| Interquartile range (IQR) | 22.6042 | 23.4844 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 50.112912 | 58.440025 |
| Coefficient of variation (CV) | 1.5691488 | 1.7072472 |
| Kurtosis | 39.850809 | 31.755854 |
| Mean | 31.936368 | 34.230558 |
| Median Absolute Deviation (MAD) | 6.74585 | 6.39585 |
| Skewness | 5.2112307 | 4.9139083 |
| Sum | 14243.62 | 15266.829 |
| Variance | 2511.304 | 3415.2365 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 13 | 23 | 5.2% |
| 8.05 | 21 | 4.7% |
| 7.8958 | 19 | 4.3% |
| 26 | 16 | 3.6% |
| 7.925 | 12 | 2.7% |
| 7.775 | 11 | 2.5% |
| 7.75 | 11 | 2.5% |
| 10.5 | 11 | 2.5% |
| 0 | 10 | 2.2% |
| 7.25 | 8 | 1.8% |
| Other values (172) | 304 |
| Value | Count | Frequency (%) |
| 8.05 | 24 | 5.4% |
| 13 | 19 | 4.3% |
| 7.8958 | 17 | 3.8% |
| 26 | 14 | 3.1% |
| 10.5 | 14 | 3.1% |
| 7.75 | 13 | 2.9% |
| 7.925 | 12 | 2.7% |
| 7.2292 | 9 | 2.0% |
| 7.775 | 9 | 2.0% |
| 0 | 8 | 1.8% |
| Other values (174) | 307 |
| Value | Count | Frequency (%) |
| 0 | 10 | |
| 4.0125 | 1 | 0.2% |
| 6.4375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.4958 | 2 | 0.4% |
| 6.95 | 1 | 0.2% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 3 | 0.7% |
| 7.125 | 2 | 0.4% |
| 7.1417 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 8 | |
| 5 | 1 | 0.2% |
| 6.4375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.4958 | 2 | 0.4% |
| 6.75 | 1 | 0.2% |
| 6.8583 | 1 | 0.2% |
| 7.05 | 4 | |
| 7.0542 | 2 | 0.4% |
| 7.125 | 2 | 0.4% |
Cabin
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 88 | 87 |
| Distinct (%) | 89.8% | 82.1% |
| Missing | 348 | 340 |
| Missing (%) | 78.0% | 76.2% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 15 | 15 |
| Median length | 3 | 3 |
| Mean length | 3.4897959 | 3.7358491 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 342 | 396 |
| Distinct characters | 18 | 18 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 78 | 70 |
| Unique (%) | 79.6% | 66.0% |
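For Cabin, the Distinct (%) and Unique (%) figures, as well as the mean length, only add up if the denominator is the number of non-missing values (98 in Dataset A, 106 in Dataset B) rather than all 446 rows. A sketch of that reading, with the denominator treated as an inference from the reported numbers and `pct_of_non_missing` as an illustrative helper:

```python
N_OBS = 446  # observations per dataset, from the dataset statistics


def pct_of_non_missing(count: int, n_missing: int, n_obs: int = N_OBS) -> float:
    """Percentage of `count` relative to the non-missing rows, one decimal place."""
    return round(100.0 * count / (n_obs - n_missing), 1)


# Counts copied from the Cabin tables: (distinct, unique, missing, total characters).
cabin = {"A": (88, 78, 348, 342), "B": (87, 70, 340, 396)}

for name, (distinct, unique, missing, total_chars) in cabin.items():
    valid = N_OBS - missing
    print(
        name,
        pct_of_non_missing(distinct, missing),
        pct_of_non_missing(unique, missing),
        round(total_chars / valid, 7),  # mean length over non-missing values
    )
```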
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | C49 | B39 |
| 2nd row | A31 | D17 |
| 3rd row | E50 | A16 |
| 4th row | C62 C64 | B3 |
| 5th row | A5 | D26 |
| Value | Count | Frequency (%) |
| d | 2 | 1.8% |
| f4 | 2 | 1.8% |
| c92 | 2 | 1.8% |
| e67 | 2 | 1.8% |
| b98 | 2 | 1.8% |
| b96 | 2 | 1.8% |
| c68 | 2 | 1.8% |
| e24 | 2 | 1.8% |
| c65 | 2 | 1.8% |
| c124 | 2 | 1.8% |
| Other values (90) | 91 |
| Value | Count | Frequency (%) |
| d | 3 | 2.4% |
| e101 | 3 | 2.4% |
| c27 | 2 | 1.6% |
| b20 | 2 | 1.6% |
| g6 | 2 | 1.6% |
| c68 | 2 | 1.6% |
| f33 | 2 | 1.6% |
| b55 | 2 | 1.6% |
| b53 | 2 | 1.6% |
| b51 | 2 | 1.6% |
| Other values (89) | 105 |
Most occurring characters
| Value | Count | Frequency (%) |
| C | 39 | |
| 1 | 31 | 9.1% |
| B | 30 | 8.8% |
| 2 | 27 | 7.9% |
| 6 | 27 | 7.9% |
| 5 | 23 | 6.7% |
| 3 | 22 | 6.4% |
| 0 | 19 | 5.6% |
| 4 | 18 | 5.3% |
| 7 | 18 | 5.3% |
| Other values (8) | 88 |
| Value | Count | Frequency (%) |
| B | 44 | |
| 1 | 39 | 9.8% |
| 2 | 34 | 8.6% |
| 3 | 34 | 8.6% |
| 5 | 29 | 7.3% |
| C | 29 | 7.3% |
| 6 | 26 | 6.6% |
| 8 | 23 | 5.8% |
| (space) | 21 | 5.3% |
| E | 18 | 4.5% |
| Other values (8) | 99 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 342 |
| Value | Count | Frequency (%) |
| (unknown) | 396 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| C | 39 | |
| 1 | 31 | 9.1% |
| B | 30 | 8.8% |
| 2 | 27 | 7.9% |
| 6 | 27 | 7.9% |
| 5 | 23 | 6.7% |
| 3 | 22 | 6.4% |
| 0 | 19 | 5.6% |
| 4 | 18 | 5.3% |
| 7 | 18 | 5.3% |
| Other values (8) | 88 |
| Value | Count | Frequency (%) |
| B | 44 | |
| 1 | 39 | 9.8% |
| 2 | 34 | 8.6% |
| 3 | 34 | 8.6% |
| 5 | 29 | 7.3% |
| C | 29 | 7.3% |
| 6 | 26 | 6.6% |
| 8 | 23 | 5.8% |
| (space) | 21 | 5.3% |
| E | 18 | 4.5% |
| Other values (8) | 99 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 342 |
| Value | Count | Frequency (%) |
| (unknown) | 396 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| C | 39 | |
| 1 | 31 | 9.1% |
| B | 30 | 8.8% |
| 2 | 27 | 7.9% |
| 6 | 27 | 7.9% |
| 5 | 23 | 6.7% |
| 3 | 22 | 6.4% |
| 0 | 19 | 5.6% |
| 4 | 18 | 5.3% |
| 7 | 18 | 5.3% |
| Other values (8) | 88 |
| Value | Count | Frequency (%) |
| B | 44 | |
| 1 | 39 | 9.8% |
| 2 | 34 | 8.6% |
| 3 | 34 | 8.6% |
| 5 | 29 | 7.3% |
| C | 29 | 7.3% |
| 6 | 26 | 6.6% |
| 8 | 23 | 5.8% |
| | 21 | 5.3% |
| E | 18 | 4.5% |
| Other values (8) | 99 | 25.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 342 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 396 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| C | 39 | 11.4% |
| 1 | 31 | 9.1% |
| B | 30 | 8.8% |
| 2 | 27 | 7.9% |
| 6 | 27 | 7.9% |
| 5 | 23 | 6.7% |
| 3 | 22 | 6.4% |
| 0 | 19 | 5.6% |
| 4 | 18 | 5.3% |
| 7 | 18 | 5.3% |
| Other values (8) | 88 | 25.7% |
| Value | Count | Frequency (%) |
| B | 44 | 11.1% |
| 1 | 39 | 9.8% |
| 2 | 34 | 8.6% |
| 3 | 34 | 8.6% |
| 5 | 29 | 7.3% |
| C | 29 | 7.3% |
| 6 | 26 | 6.6% |
| 8 | 23 | 5.8% |
| | 21 | 5.3% |
| E | 18 | 4.5% |
| Other values (8) | 99 | 25.0% |
Embarked
Categorical
| Dataset A | Dataset B | |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 1 | 1 |
| Missing (%) | 0.2% | 0.2% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| Dataset A | Dataset B | |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| Dataset A | Dataset B | |
|---|---|---|
| Total characters | 445 | 445 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| Dataset A | Dataset B | |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| Dataset A | Dataset B | |
|---|---|---|
| 1st row | C | C |
| 2nd row | S | S |
| 3rd row | C | S |
| 4th row | S | C |
| 5th row | S | S |
Common Values
| Value | Count | Frequency (%) |
| S | 325 | 72.9% |
| C | 85 | 19.1% |
| Q | 35 | 7.8% |
| (Missing) | 1 | 0.2% |
| Value | Count | Frequency (%) |
| S | 331 | 74.2% |
| C | 80 | 17.9% |
| Q | 34 | 7.6% |
| (Missing) | 1 | 0.2% |
| Value | Count | Frequency (%) |
| s | 325 | 73.0% |
| c | 85 | 19.1% |
| q | 35 | 7.9% |
| Value | Count | Frequency (%) |
| s | 331 | 74.4% |
| c | 80 | 18.0% |
| q | 34 | 7.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 325 | 73.0% |
| C | 85 | 19.1% |
| Q | 35 | 7.9% |
| Value | Count | Frequency (%) |
| S | 331 | 74.4% |
| C | 80 | 18.0% |
| Q | 34 | 7.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| S | 325 | 73.0% |
| C | 85 | 19.1% |
| Q | 35 | 7.9% |
| Value | Count | Frequency (%) |
| S | 331 | 74.4% |
| C | 80 | 18.0% |
| Q | 34 | 7.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| S | 325 | 73.0% |
| C | 85 | 19.1% |
| Q | 35 | 7.9% |
| Value | Count | Frequency (%) |
| S | 331 | 74.4% |
| C | 80 | 18.0% |
| Q | 34 | 7.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| S | 325 | 73.0% |
| C | 85 | 19.1% |
| Q | 35 | 7.9% |
| Value | Count | Frequency (%) |
| S | 331 | 74.4% |
| C | 80 | 18.0% |
| Q | 34 | 7.6% |
Dataset A
| | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp | Survived |
|---|---|---|---|---|---|---|---|---|---|
| Age | 1.000 | 0.000 | 0.128 | -0.327 | 0.074 | 0.307 | 0.000 | -0.152 | 0.177 |
| Embarked | 0.000 | 1.000 | 0.224 | 0.000 | 0.000 | 0.298 | 0.094 | 0.060 | 0.138 |
| Fare | 0.128 | 0.224 | 1.000 | 0.390 | 0.010 | 0.474 | 0.167 | 0.454 | 0.252 |
| Parch | -0.327 | 0.000 | 0.390 | 1.000 | -0.009 | 0.048 | 0.195 | 0.440 | 0.121 |
| PassengerId | 0.074 | 0.000 | 0.010 | -0.009 | 1.000 | 0.066 | 0.082 | -0.060 | 0.000 |
| Pclass | 0.307 | 0.298 | 0.474 | 0.048 | 0.066 | 1.000 | 0.166 | 0.176 | 0.327 |
| Sex | 0.000 | 0.094 | 0.167 | 0.195 | 0.082 | 0.166 | 1.000 | 0.094 | 0.513 |
| SibSp | -0.152 | 0.060 | 0.454 | 0.440 | -0.060 | 0.176 | 0.094 | 1.000 | 0.145 |
| Survived | 0.177 | 0.138 | 0.252 | 0.121 | 0.000 | 0.327 | 0.513 | 0.145 | 1.000 |
Dataset B
| | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp | Survived |
|---|---|---|---|---|---|---|---|---|---|
| Age | 1.000 | 0.000 | 0.065 | -0.299 | 0.035 | 0.241 | 0.000 | -0.230 | 0.205 |
| Embarked | 0.000 | 1.000 | 0.211 | 0.056 | 0.109 | 0.246 | 0.125 | 0.092 | 0.132 |
| Fare | 0.065 | 0.211 | 1.000 | 0.461 | -0.046 | 0.473 | 0.162 | 0.424 | 0.278 |
| Parch | -0.299 | 0.056 | 0.461 | 1.000 | -0.015 | 0.053 | 0.232 | 0.447 | 0.217 |
| PassengerId | 0.035 | 0.109 | -0.046 | -0.015 | 1.000 | 0.000 | 0.069 | -0.089 | 0.123 |
| Pclass | 0.241 | 0.246 | 0.473 | 0.053 | 0.000 | 1.000 | 0.163 | 0.110 | 0.352 |
| Sex | 0.000 | 0.125 | 0.162 | 0.232 | 0.069 | 0.163 | 1.000 | 0.180 | 0.601 |
| SibSp | -0.230 | 0.092 | 0.424 | 0.447 | -0.089 | 0.110 | 0.180 | 1.000 | 0.151 |
| Survived | 0.205 | 0.132 | 0.278 | 0.217 | 0.123 | 0.352 | 0.601 | 0.151 | 1.000 |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 64 | 65 | 0 | 1 | Stewart, Mr. Albert A | male | NaN | 0 | 0 | PC 17605 | 27.7208 | NaN | C |
| 400 | 401 | 1 | 3 | Niskanen, Mr. Juha | male | 39.0 | 0 | 0 | STON/O 2. 3101289 | 7.9250 | NaN | S |
| 177 | 178 | 0 | 1 | Isham, Miss. Ann Elizabeth | female | 50.0 | 0 | 0 | PC 17595 | 28.7125 | C49 | C |
| 376 | 377 | 1 | 3 | Landergren, Miss. Aurora Adelia | female | 22.0 | 0 | 0 | C 7077 | 7.2500 | NaN | S |
| 236 | 237 | 0 | 2 | Hold, Mr. Stephen | male | 44.0 | 1 | 0 | 26707 | 26.0000 | NaN | S |
| 865 | 866 | 1 | 2 | Bystrom, Mrs. (Karolina) | female | 42.0 | 0 | 0 | 236852 | 13.0000 | NaN | S |
| 838 | 839 | 1 | 3 | Chip, Mr. Chang | male | 32.0 | 0 | 0 | 1601 | 56.4958 | NaN | S |
| 350 | 351 | 0 | 3 | Odahl, Mr. Nils Martin | male | 23.0 | 0 | 0 | 7267 | 9.2250 | NaN | S |
| 212 | 213 | 0 | 3 | Perkin, Mr. John Henry | male | 22.0 | 0 | 0 | A/5 21174 | 7.2500 | NaN | S |
| 379 | 380 | 0 | 3 | Gustafsson, Mr. Karl Gideon | male | 19.0 | 0 | 0 | 347069 | 7.7750 | NaN | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 539 | 540 | 1 | 1 | Frolicher, Miss. Hedwig Margaritha | female | 22.0 | 0 | 2 | 13568 | 49.5000 | B39 | C |
| 862 | 863 | 1 | 1 | Swift, Mrs. Frederick Joel (Margaret Welles Barron) | female | 48.0 | 0 | 0 | 17466 | 25.9292 | D17 | S |
| 723 | 724 | 0 | 2 | Hodges, Mr. Henry Price | male | 50.0 | 0 | 0 | 250643 | 13.0000 | NaN | S |
| 556 | 557 | 1 | 1 | Duff Gordon, Lady. (Lucille Christiana Sutherland) ("Mrs Morgan") | female | 48.0 | 1 | 0 | 11755 | 39.6000 | A16 | C |
| 760 | 761 | 0 | 3 | Garfirth, Mr. John | male | NaN | 0 | 0 | 358585 | 14.5000 | NaN | S |
| 424 | 425 | 0 | 3 | Rosblom, Mr. Viktor Richard | male | 18.0 | 1 | 1 | 370129 | 20.2125 | NaN | S |
| 816 | 817 | 0 | 3 | Heininen, Miss. Wendla Maria | female | 23.0 | 0 | 0 | STON/O2. 3101290 | 7.9250 | NaN | S |
| 233 | 234 | 1 | 3 | Asplund, Miss. Lillian Gertrud | female | 5.0 | 4 | 2 | 347077 | 31.3875 | NaN | S |
| 785 | 786 | 0 | 3 | Harmer, Mr. Abraham (David Lishin) | male | 25.0 | 0 | 0 | 374887 | 7.2500 | NaN | S |
| 596 | 597 | 1 | 2 | Leitch, Miss. Jessie Wills | female | NaN | 0 | 0 | 248727 | 33.0000 | NaN | S |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 237 | 238 | 1 | 2 | Collyer, Miss. Marjorie "Lottie" | female | 8.0 | 0 | 2 | C.A. 31921 | 26.2500 | NaN | S |
| 423 | 424 | 0 | 3 | Danbom, Mrs. Ernst Gilbert (Anna Sigrid Maria Brogren) | female | 28.0 | 1 | 1 | 347080 | 14.4000 | NaN | S |
| 808 | 809 | 0 | 2 | Meyer, Mr. August | male | 39.0 | 0 | 0 | 248723 | 13.0000 | NaN | S |
| 217 | 218 | 0 | 2 | Jacobsohn, Mr. Sidney Samuel | male | 42.0 | 1 | 0 | 243847 | 27.0000 | NaN | S |
| 643 | 644 | 1 | 3 | Foo, Mr. Choong | male | NaN | 0 | 0 | 1601 | 56.4958 | NaN | S |
| 79 | 80 | 1 | 3 | Dowdell, Miss. Elizabeth | female | 30.0 | 0 | 0 | 364516 | 12.4750 | NaN | S |
| 95 | 96 | 0 | 3 | Shorney, Mr. Charles Joseph | male | NaN | 0 | 0 | 374910 | 8.0500 | NaN | S |
| 743 | 744 | 0 | 3 | McNamee, Mr. Neal | male | 24.0 | 1 | 0 | 376566 | 16.1000 | NaN | S |
| 662 | 663 | 0 | 1 | Colley, Mr. Edward Pomeroy | male | 47.0 | 0 | 0 | 5727 | 25.5875 | E58 | S |
| 594 | 595 | 0 | 2 | Chapman, Mr. John Henry | male | 37.0 | 1 | 0 | SC/AH 29037 | 26.0000 | NaN | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 467 | 468 | 0 | 1 | Smart, Mr. John Montgomery | male | 56.0 | 0 | 0 | 113792 | 26.5500 | NaN | S |
| 216 | 217 | 1 | 3 | Honkanen, Miss. Eliina | female | 27.0 | 0 | 0 | STON/O2. 3101283 | 7.9250 | NaN | S |
| 662 | 663 | 0 | 1 | Colley, Mr. Edward Pomeroy | male | 47.0 | 0 | 0 | 5727 | 25.5875 | E58 | S |
| 81 | 82 | 1 | 3 | Sheerlinck, Mr. Jan Baptist | male | 29.0 | 0 | 0 | 345779 | 9.5000 | NaN | S |
| 250 | 251 | 0 | 3 | Reed, Mr. James George | male | NaN | 0 | 0 | 362316 | 7.2500 | NaN | S |
| 160 | 161 | 0 | 3 | Cribb, Mr. John Hatfield | male | 44.0 | 0 | 1 | 371362 | 16.1000 | NaN | S |
| 580 | 581 | 1 | 2 | Christy, Miss. Julie Rachel | female | 25.0 | 1 | 1 | 237789 | 30.0000 | NaN | S |
| 619 | 620 | 0 | 2 | Gavey, Mr. Lawrence | male | 26.0 | 0 | 0 | 31028 | 10.5000 | NaN | S |
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S |
| 878 | 879 | 0 | 3 | Laleff, Mr. Kristo | male | NaN | 0 | 0 | 349217 | 7.8958 | NaN | S |
Dataset A
Dataset does not contain duplicate rows.
Dataset B
Dataset does not contain duplicate rows.